Abstract



We prove that q-ary sparse codes with small bias are self-correctable and locally testable. This
generalizes a result of Kaufman and Sudan [3], who proved the local testability and correctability of binary
sparse codes with small bias. We use properties of q-ary Krawtchouk polynomials and the MacWilliams
identity, which relates the weight distribution of a code to the weight distribution of its dual, to derive
bounds on the error probability of the randomized tester and self-corrector we analyze.

1 Introduction 

We consider the problem of error correction and detection for codes over large alphabets. Let $C \subseteq \mathbb{F}_q^n$
be a linear code. The minimum distance of $C$, denoted $\delta(C)$, is defined as $\delta(C) = \min_{x \neq y \in C} \delta(x, y)$, where
$\delta(x, y)$ is the fractional Hamming distance between $x$ and $y$. Let $w$ be a word in $\mathbb{F}_q^n$ and let $\delta(w, C) =
\min_{x \in C} \delta(w, x)$ denote the distance of $w$ to $C$. $C$ is said to be $f(n)$-sparse if $|C| \le f(n)$; we will mostly
take $f(n) = n^t$ for some $t > 0$. $C$ is said to be $\epsilon$-biased if, for all $x \neq y \in C$,
$1 - \frac{1}{q} - \epsilon \le \delta(x, y) \le 1 - \frac{1}{q} + \epsilon$.
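As a concrete illustration of these definitions, the following minimal Python sketch computes the fractional Hamming distance, the minimum distance and the bias of a small code by brute force. The generator matrix `G` and the helper names (`span`, `frac_distance`, `min_distance`, `bias`) are assumptions made for this example only; they do not come from [3], and the enumeration is only feasible for toy parameters with $q$ prime.

```python
from itertools import product

def span(generators, q):
    """Return all F_q-linear combinations of the generator rows (q prime)."""
    n = len(generators[0])
    words = set()
    for coeffs in product(range(q), repeat=len(generators)):
        word = tuple(sum(a * g[j] for a, g in zip(coeffs, generators)) % q
                     for j in range(n))
        words.add(word)
    return sorted(words)

def frac_distance(x, y):
    """Fractional Hamming distance between two words of equal length."""
    return sum(a != b for a, b in zip(x, y)) / len(x)

def min_distance(code):
    """delta(C): minimum fractional distance over pairs of distinct codewords."""
    return min(frac_distance(x, y) for x in code for y in code if x != y)

def bias(code, q):
    """Smallest eps such that all pairwise distances lie within eps of 1 - 1/q."""
    return max(abs(frac_distance(x, y) - (1 - 1 / q))
               for x in code for y in code if x != y)

# Illustrative ternary generator matrix (an assumption, not taken from the paper).
G = [(1, 0, 1, 1), (0, 1, 1, 2)]
C = span(G, 3)
print(len(C), min_distance(C), bias(C, 3))
```

In this toy code every pair of distinct codewords differs in 3 of the 4 coordinates, so $\delta(C) = 3/4$ and the bias is $3/4 - 2/3 = 1/12$.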

We prove that sparsity and small bias are sufficient conditions to test membership of $w$ in $C$, or to find the
closest codeword in $C$ to $w$, while querying only a constant number of symbols of $w$. Such codes are
called locally testable and self-correctable codes. The following definitions are adapted to q-ary codes from
[3].

Definition 1.1. Let $C \subseteq \mathbb{F}_q^n$ be a linear code. $C$ is said to be strongly $k$-locally testable if there exists a
constant $\epsilon > 0$ and a probabilistic algorithm $T$, called the tester, that given oracle access to a vector $v \in \mathbb{F}_q^n$,
queries the oracle at most $k$ times, accepts every $v \in C$ with probability 1 and rejects every $v \notin C$ with
probability at least $\epsilon \cdot \delta(v, C)$.

Definition 1.2. Let $C \subseteq \mathbb{F}_q^n$ be a linear code. $C$ is said to be $k$-self-correctable if there exist constants
$\tau > 0$ and $0 < \epsilon < \frac{1}{2}$ and a probabilistic algorithm $SC$, called the self-corrector, that given oracle access
to a vector $v \in \mathbb{F}_q^n$ that is $\tau$-close to a codeword $c \in C$ and an index $i \in [n]$, queries the oracle at most $k$
times and computes $c_i$ with probability at least $1 - \epsilon$.

Note that self-correctable codes include locally-decodable codes since self-correctability is independent 
of the encoder used and only uses the codewords themselves, while local decodability requires knowledge 
of the encoder and the messages. 
1.1 Prior work and motivation



Dual-BCH codes are an important family of sparse unbiased codes and have a major role in Coding Theory. 
Proving local testability and self-correctability for these codes will add to their powerful properties, thus 
opening the door to new techniques in error detection and correction. 

Moreover, local testability and self-correctability have many applications in Complexity Theory. Although
codes satisfying these properties are not efficient (the rate tends to 0 as the block length tends to infinity),
the local nature of the underlying algorithms proves to be very useful in constructing Probabilistically
Checkable Proofs, where one should accept correct proofs while checking only a few locations in the proof.

Hadamard codes were the first codes shown to be locally testable and self-correctable, in [1]. Recently,
Kaufman and Litsyn proved in [2] that almost-orthogonal codes are locally testable and self-correctable.
Such codes include dual-BCH codes. Kaufman and Sudan generalized these previous results and proved in
[3] that sparse random binary linear codes are locally testable and decodable.

In [4], Kopparty and Saraf proved that random sparse binary linear codes are locally testable and locally
list-decodable. They strengthen the results of Kaufman and Sudan by correcting in the high-error regime, while
using simpler proofs. They reduce the problem of testing/decoding codewords to that of testing/decoding
linear functions under distributions. In [5], they prove that sparse low-bias codes over any abelian group
are locally testable. Although their results subsume the results of [3] (except for removing the small bias
property in the local testability case), we obtained our results independently and before their work was published.

1.2 Summary of results 

We generalize the techniques in [3] to prove that q-ary sparse codes with small bias are locally testable
and self-correctable. We follow the proof strategy of Kaufman and Sudan. We use properties of q-ary
Krawtchouk polynomials and the MacWilliams identity to bound the weight distributions of duals of sparse
codes with small bias. These properties of q-ary Krawtchouk polynomials are non-obvious and were only
obtained after the very recent work on zeros of discrete orthogonal polynomials in [6]. Thus, extending the
results of [3] to q-ary codes requires a more detailed analysis of the underlying Krawtchouk polynomials
and their properties.

Using the derived weight distribution bounds, we get the local testability result: 

Theorem 1. Let $\mathbb{F}_q$ be the finite field of size $q$ and let $C \subseteq \mathbb{F}_q^n$ be a linear code. For every $t < \infty$ and $\gamma > 0$,
there exists a constant $k = k_{q,t,\gamma} < \infty$ such that if $C$ is $n^t$-sparse and $n^{-\gamma}$-biased, then $C$ is strongly $k$-locally
testable.

To get the self-correctability result, we apply the weight distribution bounds to punctured codes, obtained
by removing one or two symbols from the codewords of the original code. Namely, we prove:

Theorem 2. For every $t < \infty$ and $\gamma > 0$, there exists a constant $k = k_{t,\gamma}$ such that if $C \subseteq \mathbb{F}_q^n$ is
$n^t$-sparse and $n^{-\gamma}$-biased, then $C$ is $k$-self-correctable.

1.3 Organization of the paper 

In Section 2, we use properties of q-ary Krawtchouk polynomials and the MacWilliams identity to bound
the weight distribution of the dual code. In Sections 3 and 4, we prove Theorems 1 and 2 respectively.
2 Weight distribution of duals of sparse codes

We use the MacWilliams identity to relate the weight distribution of a code to that of its dual.

2.1 Q-ary Krawtchouk Polynomials and MacWilliams Identity

Let $q$, $k$ and $n$ be positive integers ($k, q \le n$). Q-ary Krawtchouk polynomials $P_k(i, q, n)$ are orthogonal
polynomials on $i = 0, \ldots, n$, with respect to the measure $\binom{n}{i}(q-1)^i$. They are defined as follows:

$$P_k(i, q, n) = \sum_{j=0}^{k} (-1)^j (q-1)^{k-j} \binom{i}{j} \binom{n-i}{k-j}.$$

From now on, we will drop the $q$ and $n$ from the notation $P_k$ when they are clear from context.

2.1.1 Properties 

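1. $P_k(0) = \binom{n}{k}(q-1)^k$.

2. For every $i$, $P_k(i, q, n) = P_k(n - i, \frac{q}{q-1}, n)\,(1 - q)^k$.

3. $P_k(i)$ has $k$ real roots lying between $(1 - \frac{1}{q})n - k(1 - \frac{2}{q}) - \frac{2}{q}\sqrt{(q-1)k(n-k)}$ and
$(1 - \frac{1}{q})n - k(1 - \frac{2}{q}) + \frac{2}{q}\sqrt{(q-1)k(n-k)}$.

4. Let $\mu_1 = (1 - \frac{1}{q})n - k(1 - \frac{2}{q}) - \frac{2}{q}\sqrt{(q-1)k(n-k)}$ and
$\mu_2 = (1 - \frac{1}{q})n - k(1 - \frac{2}{q}) + \frac{2}{q}\sqrt{(q-1)k(n-k)}$.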

a. < - \)n - i] k , for all i G [n]. 

b. |P fc (i)| < £ [fc + §(V(<7 " 1)^ " fc) " ^)] fe , for Mi < » < A*2- 
c. $P_k(i) < 0$, for $\mu_2 < i \le n$ and odd $k$.

Proof. 1 and 2 follow from the definition of $P_k$. 3 is from Theorem 6 of [6]. 4 follows from 3 and basic
manipulations of $P_k$. □
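Properties 1 and 2 can be checked numerically. The sketch below assumes the standard explicit formula $P_k(i, q, n) = \sum_{j=0}^{k} (-1)^j (q-1)^{k-j} \binom{i}{j} \binom{n-i}{k-j}$ for the q-ary Krawtchouk polynomial and uses exact rational arithmetic so that the non-integer parameter $\frac{q}{q-1}$ of property 2 can be plugged in. The function names are hypothetical and chosen for this illustration.

```python
from fractions import Fraction
from math import comb

def krawtchouk(k, i, q, n):
    """P_k(i, q, n) = sum_{j=0}^{k} (-1)^j (q-1)^(k-j) C(i, j) C(n-i, k-j).

    q may be an int or a Fraction, so that q/(q-1) can be used exactly.
    """
    q = Fraction(q)
    return sum((-1) ** j * (q - 1) ** (k - j) * comb(i, j) * comb(n - i, k - j)
               for j in range(k + 1))

def check_properties(q, n, k):
    """Check properties 1 and 2 for every i in {0, ..., n}."""
    prop1 = krawtchouk(k, 0, q, n) == comb(n, k) * (q - 1) ** k
    prop2 = all(
        krawtchouk(k, i, q, n)
        == krawtchouk(k, n - i, Fraction(q, q - 1), n) * (1 - q) ** k
        for i in range(n + 1)
    )
    return prop1, prop2

print(check_properties(q=3, n=12, k=5))
```

For $q = 2$, property 2 reduces to the familiar binary symmetry $P_k(i) = (-1)^k P_k(n - i)$, which the same check covers.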

Let $C$ be a linear code of length $n$ over $\mathbb{F}_q$. For every $i \in [n]$, let $B_i^C$ be the number of codewords in $C$
of weight $i$. The weight distribution of $C$ is given by the vector $\langle B_0^C = 1, \ldots, B_n^C \rangle$. Let $C^\perp$ denote the
dual code of $C$.

Theorem 3 (MacWilliams Identity). For a linear code $C$ over $\mathbb{F}_q$ of length $n$, $B_k^{C^\perp} = \frac{1}{|C|} \sum_{i=0}^{n} B_i^C P_k(i, q, n)$,
where $P_k(i, q, n)$ is the generalized Krawtchouk polynomial of degree $k$.
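The identity can be verified by brute force on a very small code. The following sketch compares the dual weight distribution predicted by the MacWilliams identity with the one obtained by enumerating $C^\perp$ directly. It assumes $q$ prime, the standard explicit formula for $P_k$, and an illustrative generator matrix `G` that is not taken from the paper; all helper names are hypothetical.

```python
from itertools import product
from math import comb

def krawtchouk(k, i, q, n):
    """Standard q-ary Krawtchouk polynomial P_k(i, q, n) for integer q."""
    return sum((-1) ** j * (q - 1) ** (k - j) * comb(i, j) * comb(n - i, k - j)
               for j in range(k + 1))

def span(generators, q):
    """All F_q-linear combinations of the generator rows (q prime)."""
    n = len(generators[0])
    return {tuple(sum(a * g[j] for a, g in zip(coeffs, generators)) % q
                  for j in range(n))
            for coeffs in product(range(q), repeat=len(generators))}

def dual(generators, q):
    """C^perp by brute force: all words orthogonal to every generator."""
    n = len(generators[0])
    return [y for y in product(range(q), repeat=n)
            if all(sum(a * b for a, b in zip(y, g)) % q == 0 for g in generators)]

def weight_distribution(words, n):
    """B[i] = number of words of Hamming weight i."""
    B = [0] * (n + 1)
    for w in words:
        B[sum(s != 0 for s in w)] += 1
    return B

def macwilliams(B, q):
    """Dual weight distribution predicted by the MacWilliams identity."""
    n = len(B) - 1
    size = sum(B)  # |C|
    return [sum(B[i] * krawtchouk(k, i, q, n) for i in range(n + 1)) // size
            for k in range(n + 1)]

G = [(1, 0, 1, 1), (0, 1, 1, 2)]  # illustrative ternary code (an assumption)
B = weight_distribution(span(G, 3), 4)
print(macwilliams(B, 3), weight_distribution(dual(G, 3), 4))
```

The two printed vectors agree; this particular toy code happens to be self-dual, so both also coincide with the weight distribution of $C$ itself.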

The following proposition lists some properties of a linear code C. 

Proposition 4. [3] Let $C$ be an $n^t$-sparse linear code over $\mathbb{F}_q$ with $\delta(C) \ge 1 - \frac{1}{q} - n^{-\gamma}$, for some
$t, \gamma > 0$. Then:

• $B_i^C = 0$ for all $i \in \{1, \ldots, (1 - \frac{1}{q})n - n^{1-\gamma}\}$.

• If $C$ is $n^{-\gamma}$-biased, then $B_i^C = 0$ for all $i \in \{(1 - \frac{1}{q})n + n^{1-\gamma}, \ldots, n\}$.

The following is the equivalent of Claim 3.4 in [3] for q-ary linear codes.

Claim 5. For every $\gamma > 0$ and $c, t < \infty$, if $k \ge (t + c + 1)/\gamma$, then for any $n^t$-sparse set $S \subseteq \mathbb{F}_q^n$:

$$\left| \sum_{i=(1-1/q)n-n^{1-\gamma}}^{(1-1/q)n+n^{1-\gamma}} B_i^S P_k(i) \right| = o(n^{-c})\,P_k(0).$$

Furthermore, if $k$ is odd, we have

$$\sum_{i=(1-1/q)n-n^{1-\gamma}}^{n} B_i^S P_k(i) \le o(n^{-c})\,P_k(0).$$



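Proof.

$$\left| \sum_{i=(1-1/q)n-n^{1-\gamma}}^{(1-1/q)n+n^{1-\gamma}} B_i^S P_k(i) \right|
\le \sum_{i=(1-1/q)n-n^{1-\gamma}}^{(1-1/q)n+n^{1-\gamma}} B_i^S\,|P_k(i)|
\le \max_i |P_k(i)| \sum_i B_i^S
\le \frac{\cdots}{k!}
= o(n^{-c})\,P_k(0).$$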



The third inequality follows from applying property 4.a of Krawtchouk polynomials for $(1 - 1/q)n - n^{1-\gamma} \le
i \le (1 - 1/q)n + n^{1-\gamma}$. The first equality follows from the facts that $k \ge (t + c + 1)/\gamma$ and $|S| \le n^t$.
The second part of the claim follows from the fact that, for odd $k$, $P_k(i) < 0$ for every $i \in \{(1 - \frac{1}{q})n + n^{1-\gamma}, \ldots, n\}$. □

We will use the above claim to bound the weight enumerators of $C^\perp$. The following lemma is the
equivalent of Lemma 3.5 in [3].



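Lemma 6. Let $C$ be an $n^t$-sparse code in $\mathbb{F}_q^n$ with $\delta(C) \ge 1 - \frac{1}{q} - n^{-\gamma}$. Then, for every $c, t, \gamma > 0$, there
exists a $k_0$ such that for every odd $k \ge k_0$, $B_k^{C^\perp} \le \frac{P_k(0)}{|C|}(1 + o(n^{-c}))$. If $C$ is $n^{-\gamma}$-biased, then for every (odd
and even) $k \ge k_0$, $B_k^{C^\perp} = \frac{P_k(0)}{|C|}(1 + \theta(n^{-c}))$.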

Note. As mentioned in [3], the notation $f(n) = g(n) + \theta(h(n))$ means that, for every $\epsilon > 0$ and for large
enough $n$, $g(n) - \epsilon h(n) \le f(n) \le g(n) + \epsilon h(n)$.
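Proof. By the MacWilliams identity we get:

$$\begin{aligned}
B_k^{C^\perp} &= \frac{1}{|C|} \sum_{i=0}^{n} B_i^C P_k(i) \\
&= \frac{P_k(0)}{|C|} + \frac{1}{|C|} \sum_{i=1}^{n} B_i^C P_k(i) \\
&= \frac{P_k(0)}{|C|} + \frac{1}{|C|} \sum_{i=(1-\frac{1}{q})n-n^{1-\gamma}}^{n} B_i^C P_k(i) \\
&\le \frac{P_k(0)}{|C|} + \frac{1}{|C|}\left(o(n^{-c})\,P_k(0)\right), \text{ using Claim 5} \\
&= \frac{P_k(0)}{|C|}\left(1 + o(n^{-c})\right),
\end{aligned}$$

where the third equality follows from $\delta(C) \ge 1 - \frac{1}{q} - n^{-\gamma}$. For the second part of the lemma, when $C$ is
$n^{-\gamma}$-biased:

$$\begin{aligned}
B_k^{C^\perp} &= \frac{P_k(0)}{|C|} + \frac{1}{|C|} \sum_{i=1}^{n} B_i^C P_k(i) \\
&= \frac{P_k(0)}{|C|} + \frac{1}{|C|} \sum_{i=(1-\frac{1}{q})n-n^{1-\gamma}}^{(1-\frac{1}{q})n+n^{1-\gamma}} B_i^C P_k(i) \\
&= \frac{P_k(0)}{|C|} + \frac{1}{|C|}\left(\theta(n^{-c})\,P_k(0)\right), \text{ using Claim 5} \\
&= \frac{P_k(0)}{|C|}\left(1 + \theta(n^{-c})\right),
\end{aligned}$$

where the second equality follows since $C$ is $n^{-\gamma}$-biased. □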

3 Local Testing 

We will use a canonical tester that uses codewords in the dual of a code to test membership of words in the
code. The following tester is proposed in [3] for binary codes:

$T_k^v$:

• Choose $y \in_U [C^\perp]_k$, where $[C^\perp]_k$ is the set of codewords in $C^\perp$ of weight $k$.

• Accept if and only if $\langle y, v \rangle = 0$.
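The two steps of the tester can be sketched in Python as follows. This is a brute-force illustration under stated assumptions, not an efficient implementation: $q$ is taken to be prime, $[C^\perp]_k$ is enumerated exhaustively (feasible only for toy block lengths), the generator matrix `G` is an illustrative choice, and all function names are hypothetical.

```python
import random
from itertools import product

def dual_words_of_weight(generators, q, k):
    """[C^perp]_k by brute force: weight-k words orthogonal to every generator."""
    n = len(generators[0])
    return [y for y in product(range(q), repeat=n)
            if sum(s != 0 for s in y) == k
            and all(sum(a * b for a, b in zip(y, g)) % q == 0 for g in generators)]

def inner_product_via_queries(y, oracle, q):
    """<y, v> mod q, querying v only where y is non-zero (at most k queries)."""
    return sum(y_j * oracle(j) for j, y_j in enumerate(y) if y_j != 0) % q

def canonical_test(oracle, dual_k, q, rng=random):
    """One run of the tester: pick y uniformly from [C^perp]_k, accept iff <y, v> = 0."""
    y = rng.choice(dual_k)
    return inner_product_via_queries(y, oracle, q) == 0

def rejection_probability(v, dual_k, q):
    """Exact rejection probability, averaging over all y in [C^perp]_k."""
    rejected = sum(inner_product_via_queries(y, v.__getitem__, q) != 0
                   for y in dual_k)
    return rejected / len(dual_k)

G = [(1, 0, 1, 1), (0, 1, 1, 2)]   # illustrative ternary code (an assumption)
dual_3 = dual_words_of_weight(G, 3, 3)
codeword = (1, 1, 2, 0)            # sum of the two generator rows
corrupted = (2, 1, 2, 0)           # same word with its first symbol changed
print(rejection_probability(codeword, dual_3, 3),
      rejection_probability(corrupted, dual_3, 3))
```

A codeword is never rejected, while the corrupted word is rejected by exactly those dual words that are non-zero at the corrupted position.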



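If $v \in C$, $T_k^v$ accepts with probability 1. We want to estimate the probability that $T_k^v$ accepts $v$ when
$v \notin C$. Following the same approach as in [3], we look at a new code that is the linear span of $C$ and $v$,

$$C\|v = \bigcup_{\mu \in \mathbb{F}_q} (C + \mu v).$$

Proposition 7. For $v \notin C$, $T_k^v$ rejects $v$ with probability $\mathrm{Rej}_k(v) = 1 - \frac{B_k^{(C\|v)^\perp}}{B_k^{C^\perp}}$.

Proof. If $T_k^v$ accepts $v$, then $\langle y, v \rangle = 0$, and hence $y \in (C\|v)^\perp$, since for all $x \in C$ and $\mu \in \mathbb{F}_q$, $\langle y, x \rangle = 0$ and
$\langle y, x + \mu v \rangle = \langle y, x \rangle + \mu \langle y, v \rangle = 0$. Conversely, if $y \in (C\|v)^\perp$, then $\langle y, x + \mu v \rangle = 0$ for all $x \in C$, $\mu \in \mathbb{F}_q$,
so in particular $\langle y, v \rangle = 0$. □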

We now show that $\mathrm{Rej}_k(v) = \Omega(\delta(v, C))$.
We will need the following two lemmas.

Lemma 8. [3] For every $k$, for sufficiently large $n$ and for every $r < \frac{1}{2}$, $P_k(rn) \le (1 - r)^k P_k(0)$.

Proof. The proof is the same as in [3], where we use the fact that $P_k(0) = \binom{n}{k}(q-1)^k$ and property 4.a of
q-ary Krawtchouk polynomials. □

The following lemma is the equivalent of Lemma 5.4 in [3].

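Lemma 9. Let $k, t, \gamma$ be constants. Let $\gamma' \le \gamma/2$. For sufficiently large $n$, let $D$ be an $n^t$-sparse code in $\mathbb{F}_q^n$
of distance at least $1 - \frac{1}{q} - n^{-\gamma}$. Let $\delta \le \frac{1}{2}$ and let $a = \max\{(1 - \frac{1}{q})n - n^{1-\gamma'} - \delta n, \delta n\}$ and
$b = (1 - \frac{1}{q})n - n^{1-\gamma'}$. Then

$$\sum_{i=a}^{b} P_k(i) B_i^D \le 2(q^2 + q)\,P_k(0) \cdot \min\left\{ \left(1 - \tfrac{q}{q-1}\delta\right)^{k-2},\ \left(\tfrac{2q}{q-1}\delta\right)^{k-2} + \left(\tfrac{2q}{q-1}n^{-\gamma'}\right)^{k-2} \right\}.$$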
Proof. For a code with minimum distance n(l — -) — n 1-7 , the Johnson bound states that, for 



q> 

2 
< nil — -) — n 1 7 / 2 , the number of codewords in a ball of radius i is at most -, — %



Let mi = ^ g " ,y ? . Then, by the Johnson bound, Y?j=o — m *' ^ or a ^ * — ^ ^ e § et: 

A_ f„_i\fc_!L 



^ (g~l) fc f 9 A" , (g~l) fc A ( q \ k ( v 

k\ \ q-l ) k\ V q-l J 

Replacing $m_a$ by its value in the first term, we get

{q-lf ( q \ k (q-l) k ( q \ k qn 2 q(q - if 2 ( q N k ~ 2 

n a m a < — n a x ^ < — n n a 



k\ v 9-1/ k\ v g-i y ^ n _j_ a y~ fci V 9-1 



9-1 



For the second term, note that m; — m,;_i < 2q - 3 . Hence, we get 
(Q-l) k ( 1 -\\ ^ <r (g Z *)* f * V V« 2



n 1 



i=a+l v * ' i=a+l v * ' ( n 



< 2g 2 n 2 (g-l) fc " ( q ^ fc - 3 

^ — fci — E [ n -—\ 

i=a+l v 

< ( b -a){n-— ja 

q 2 n 2 (q-l) k ( q ^ k ~ 2 

< : n a 



k\ V 9-! 
Combining the above, we get 



f n/ , DP 7AgK(g-if / ? V 
1^M*« < ^ ^ - 7— r a J 



Substituting for the value of $a$ and using the bound $\frac{(q-1)^k n^k}{k!} \le 2P_k(0)$, we get:

Eta < 2(g 2 + g)P fc (0). min I (l - ^5) "~ 2 ,(j^(5 + N *"* 

Using the convexity of the function $f(x) = x^{k-2}$, we get:

El a Pk(i)BF < 2tf + q)P k (0).mm |(1 - frS) 1 -*, {j^tf^ + ( ^y™" 7 ' ^ 
The lemma follows since for $x, y, z > 0$, $\min\{x, y + z\} \le \min\{x, y\} + z$. □

Back to the weight enumerator of the dual code, we use the above two lemmas to prove the following 
lemma: 

Lemma 10. For every $c, t < \infty$ and $\gamma > 0$, there exists a $k_0$ such that if $C$ is an $n^t$-sparse code of distance
$\delta(C) \ge 1 - \frac{1}{q} - n^{-\gamma}$ and $v \in \mathbb{F}_q^n$ is $\delta$-far from $C$, then, for odd $k \ge k_0$,

Bf llv)± < (1 - 5 - + o(n- c )) P k (0) 

Proof. Let $\gamma' = \gamma/2$. We will prove the lemma for $k_0 = \max\{k_1, k_2, 16(q^2 + q)\}$, where $k_1$ is chosen to
be big enough so that Claim 5 applies and $k_2$ is the constant given by Lemma 6 as a function of $t$, $c$ and $\gamma$.
By the MacWilliams identity, we have

By Lemma 6, we have $B_k^{C^\perp} \le \frac{P_k(0)}{|C|}(1 + o(n^{-c}))$. Hence, it's enough to prove

^ £ B ( (0 »fWi) < (1 - i + °(0)^. 

Applying Claim 5 to $(C + \mu v)$, we get $\sum_{i=n(1-\frac{1}{q})-n^{1-\gamma'}}^{n} B_i^{C+\mu v} P_k(i) \le o(n^{-c}) \cdot P_k(0)$. Now, it suffices to
prove

n (i_I)_ n i-V 

£ Sf = (1 - 5 + o(n- c ))P k (0)- 

i=0 

Since 8(C) > 1 — ~ — n -7 , Bf +>1V = for every i = {0, . . . , n(l — ~) — n 1 " 7 ' — 5n}, except possibly for 
i = 5n (v G C + v and <$(«, C) < 5). If 5n < n(l - ±) - n 1 " 7 ', B c ^ av = 1. Thus, we have 

n(l-f)-nW b 

where $a = \max\{n(1 - \frac{1}{q}) - n^{1-\gamma'} - \delta n, \delta n\}$ and $b = n(1 - \frac{1}{q}) - n^{1-\gamma'}$. Using the bound in Lemma 9 for these
values of $a$ and $b$, we get:

Zl=o~ lq) ~ nlW Bf + ^P k (i) < P fc (0)((l - + 2(g 2 + g) min {(1 - ^5)^, (^)^} + 
0(n" 7 '( fc - 2 ))). 

Now, we want to prove that (1 - S) k + 2(q 2 + q) min |(1 - ^r^)^ 2 , (^jj*)* 2 j < (1 - 5), for 
< S < 1/2. 

• For S < q and fc > 2 + -\, we have 

(1 - «)* + 2(g 2 + g) ( Ii V) fe ~ 2 < (1 - + 2(g 2 + g) (j^) 2 < 1 - $8 + ^S < 1 - S. 

• For g^jm < 5 < 1/2 and > 16(g 2 + g), we have 

< i and2(g 2 + g)(l-^ T 5) fc - 2 < Hence (l-<5) fe + 2(g 2 + g)(l-^) fc - 2 < ± < 1-5. 
The lemma follows since $k$ is large enough for Claim 5 to apply. □

Finally, we prove the main theorem for local testability for small-bias codes over large alphabets.

Proof of Theorem 1. Given $t$ and $\gamma$, let $k$ be an odd integer greater than $k_0$ as given by Lemma 10. The
rest of the proof is the same as the proof of Theorem 5.5 in [3].

4 Self-correctability 

Below is the canonical self-corrector for the code $C$ that uses the dual code $C^\perp$. This algorithm is used in
[3] for binary codes.

$SC_k^v(i)$:

• Choose $y \in_U [C^\perp]_{k,i}$, where $[C^\perp]_{k,i}$ is the set of codewords in $C^\perp$ of weight $k$ that have a non-zero
value at index $i$.

• Compute $(-y_i)^{-1} \sum_{j \in [n] - \{i\} \text{ s.t. } y_j \neq 0} y_j v_j$.

$SC_k^v(i)$ has oracle access to $v$ such that $\delta(v, c) < \tau$ for some $c \in C$. It makes $k - 1$ queries to $v$. We
want to estimate the probability that $SC_k^v(i)$ doesn't compute $c_i$.
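The self-corrector can be sketched in the same brute-force style. The code below assumes $q$ prime (so that $(-y_i)^{-1}$ can be computed with modular inversion), enumerates $[C^\perp]_{k,i}$ exhaustively, and uses an illustrative generator matrix `G`; the function names are hypothetical. It relies on the fact that $\langle y, c \rangle = 0$ for $c \in C$ gives $c_i = (-y_i)^{-1} \sum_{j \neq i} y_j c_j$.

```python
import random
from itertools import product

def dual_words_k_i(generators, q, k, i):
    """[C^perp]_{k,i}: weight-k dual words that are non-zero at index i."""
    n = len(generators[0])
    return [y for y in product(range(q), repeat=n)
            if y[i] != 0 and sum(s != 0 for s in y) == k
            and all(sum(a * b for a, b in zip(y, g)) % q == 0 for g in generators)]

def correct_with(y, oracle, i, q):
    """(-y_i)^{-1} * sum_{j != i, y_j != 0} y_j v_j over F_q, using k - 1 queries."""
    total = sum(y[j] * oracle(j) for j in range(len(y))
                if j != i and y[j] != 0) % q
    return pow(-y[i] % q, -1, q) * total % q

def self_correct(oracle, dual_k_i, i, q, rng=random):
    """One run of the self-corrector with y drawn uniformly from [C^perp]_{k,i}."""
    return correct_with(rng.choice(dual_k_i), oracle, i, q)

G = [(1, 0, 1, 1), (0, 1, 1, 2)]   # illustrative ternary code (an assumption)
c = (1, 1, 2, 0)                   # a codeword: sum of the two generator rows
v = (1, 1, 2, 1)                   # c with an error at index 3
ys = dual_words_k_i(G, 3, 3, 0)
print(self_correct(c.__getitem__, ys, 0, 3),
      sum(correct_with(y, v.__getitem__, 0, 3) == c[0] for y in ys), len(ys))
```

On an uncorrupted codeword every choice of $y$ recovers $c_i$. On the corrupted word only the $y$ that vanish at the error position succeed, which is the event analyzed below; with a block length this small the error fraction is of course far outside the regime covered by the analysis.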

We'll start by estimating the probability that $y \in_U [C^\perp]_k$ has non-zero entries at indices $i$ and $j$. As in
[3], let $C^{-i} = \{\pi_{-i}(c), c \in C\}$, where $\pi_{-i}(c_1, \ldots, c_{i-1}, c_i, c_{i+1}, \ldots, c_n) = (c_1, \ldots, c_{i-1}, c_{i+1}, \ldots, c_n)$.

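Proposition 11. [3] Let $\pi_{-i}^{-1}(c_1, \ldots, c_{i-1}, c_{i+1}, \ldots, c_n) = (c_1, \ldots, c_{i-1}, 0, c_{i+1}, \ldots, c_n)$ and $\pi_{-i}^{-1}(S) =
\{\pi_{-i}^{-1}(y) \mid y \in S\}$. Then $[C^\perp]_{k,i} = [C^\perp]_k - \pi_{-i}^{-1}([(C^{-i})^\perp]_k)$ and thus, $|[C^\perp]_{k,i}| = |[C^\perp]_k| - |[(C^{-i})^\perp]_k|$.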

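Proof. If $y \in C^\perp$ with $y_i = 0$, then $\pi_{-i}(y) \in (C^{-i})^\perp$ and hence $\{\pi_{-i}(y) \mid y \in C^\perp, y_i = 0\} \subseteq (C^{-i})^\perp$.
We'll show that $(C^{-i})^\perp = \{\pi_{-i}(y) \mid y \in C^\perp, y_i = 0\}$.

$(C^{-i})^\perp$ is the dual of $C^{-i}$, hence $|(C^{-i})^\perp| = \frac{q^{n-1}}{|C^{-i}|} = \frac{q^{n-1}}{|C|}$ since $\delta(C) > \frac{1}{n}$. Thus, $|(C^{-i})^\perp| = \frac{1}{q}|C^\perp|$.

On the other hand, $|\{\pi_{-i}(y) \mid y \in C^\perp, y_i = 0\}| \ge \frac{1}{q}|C^\perp|$. Therefore $[C^\perp]_{k,i} = [C^\perp]_k - \pi_{-i}^{-1}([(C^{-i})^\perp]_k)$.

□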

Extending the puncturing to two indices $i \neq j$, let $C^{-\{i,j\}}$ be the projection of $C$ on $[n] - \{i, j\}$ and let
$\pi_{-\{i,j\}}$ be that projection. Thus, we get:



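Proposition 12. [3] For every $i \neq j$, $[C^\perp]_{k,\{i,j\}} = [C^\perp]_k - \pi_{-i}^{-1}([(C^{-i})^\perp]_k) - \pi_{-j}^{-1}([(C^{-j})^\perp]_k) +
\pi_{-\{i,j\}}^{-1}([(C^{-\{i,j\}})^\perp]_k)$. Hence, $|[C^\perp]_{k,\{i,j\}}| = |[C^\perp]_k| - |[(C^{-i})^\perp]_k| - |[(C^{-j})^\perp]_k| + |[(C^{-\{i,j\}})^\perp]_k|$.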



Proof. Same as in [3]. □

Using what we know about weight distributions of the above dual codes, we derive the probability that
$y_j \neq 0$ when $y$ is chosen at random from $[C^\perp]_{k,i}$.
Lemma 13. For every $\gamma > 0$ and $c, t < \infty$, there exists $k$ such that for sufficiently large $n$, if $C \subseteq \mathbb{F}_q^n$ is an
$n^t$-sparse, $n^{-\gamma}$-biased linear code, then for every $i \neq j \in [n]$, the probability that $y_j \neq 0$, when $y$ is chosen
at random from $[C^\perp]_{k,i}$, is $\frac{k-1}{n-1} + \theta(n^{-c})$.





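Proof. $\Pr_{y \in_U [C^\perp]_{k,i}}[y_j \neq 0] = \frac{|[C^\perp]_{k,\{i,j\}}|}{|[C^\perp]_{k,i}|}$. Using the above two propositions, we can calculate those two
quantities. $C$, $C^{-i}$, $C^{-j}$ and $C^{-\{i,j\}}$ are $n^t$-sparse, $n^{-\gamma}$-biased codes, with respective block lengths $n$,
$n - 1$, $n - 1$ and $n - 2$. Note also that they all have the same size. Picking $k$ large enough to apply Lemma
6 to these codes, we get:

$$|[C^\perp]_k| = \frac{(q-1)^k}{|C|}\left(\binom{n}{k} + \theta(n^{k-c})\right),$$

$$|[(C^{-i})^\perp]_k| = |[(C^{-j})^\perp]_k| = \frac{(q-1)^k}{|C|}\left(\binom{n-1}{k} + \theta(n^{k-c})\right),$$

$$|[(C^{-\{i,j\}})^\perp]_k| = \frac{(q-1)^k}{|C|}\left(\binom{n-2}{k} + \theta(n^{k-c})\right).$$

Now we use Propositions 11 and 12 to obtain:

$$|[C^\perp]_{k,i}| = \frac{(q-1)^k}{|C|}\left(\binom{n-1}{k-1} + \theta(n^{k-c})\right)$$

and

$$|[C^\perp]_{k,\{i,j\}}| = \frac{(q-1)^k}{|C|}\left(\binom{n-2}{k-1}\frac{k-1}{n-k} + \theta(n^{k-c})\right).$$

Finally,

$$\Pr_{y \in_U [C^\perp]_{k,i}}[y_j \neq 0] = \frac{\binom{n-2}{k-1}\frac{k-1}{n-k} + \theta(n^{k-c})}{\binom{n-1}{k-1} + \theta(n^{k-c})} = \frac{k-1}{n-1} + \theta(n^{-c}).$$

□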
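The main term $\frac{k-1}{n-1}$ comes purely from counting. As a sanity check, the sketch below drops the $\theta(\cdot)$ error terms by taking the dual to be all of $\mathbb{F}_q^n$ (that is, $C = \{0\}$, an assumption made only for this illustration) and counts weight-$k$ words by brute force. The parameters and function names are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import comb

def count_weight_k(q, n, k, nonzero_at=()):
    """Number of weight-k words in F_q^n that are non-zero at the given indices."""
    return sum(1 for y in product(range(q), repeat=n)
               if sum(s != 0 for s in y) == k
               and all(y[t] != 0 for t in nonzero_at))

def hit_probability(q, n, k, i, j):
    """Pr[y_j != 0] for y uniform among weight-k words with y_i != 0."""
    return Fraction(count_weight_k(q, n, k, (i, j)),
                    count_weight_k(q, n, k, (i,)))

q, n, k = 3, 6, 3
# Inclusion-exclusion counts used in the proof, without the error terms.
assert count_weight_k(q, n, k, (0,)) == comb(n - 1, k - 1) * (q - 1) ** k
assert count_weight_k(q, n, k, (0, 1)) == comb(n - 2, k - 2) * (q - 1) ** k
print(hit_probability(q, n, k, 0, 1), Fraction(k - 1, n - 1))
```

Both printed values are equal, matching $\binom{n-2}{k-2} / \binom{n-1}{k-1} = \frac{k-1}{n-1}$, where $\binom{n-2}{k-2} = \binom{n-2}{k-1}\frac{k-1}{n-k}$.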



Now, we prove the main lemma that bounds the error probability of $SC_k^v(i)$.

Lemma 14. [3] For every $t < \infty$, $\gamma > 0$, there exists a $k = k_{t,\gamma} < \infty$ such that: if $C$ is an $n^t$-sparse,
$n^{-\gamma}$-biased linear code in $\mathbb{F}_q^n$ and $v \in \mathbb{F}_q^n$ is $\tau$-close to $C$, then for every $i \in [n]$, $\Pr[SC_k^v(i) \neq c_i] \le
k\tau + \theta(1/n)$.

Proof. Pick $k$ large enough to apply Lemma 13 with $c = 2$. Hence, for $y \in_U [C^\perp]_{k,i}$ and $i \neq j$,
$\Pr[y_j \neq 0] = \frac{k-1}{n-1} + \theta(n^{-2})$.

Let $E$ be the set of errors in $v$, i.e. $E = \{j \in [n] \mid v_j \neq c_j\}$. Since $v$ is $\tau$-close to $C$, $|E| \le \tau n$. Let
$S_y$ be the set of queried non-zero symbols in $y$, i.e. $S_y = \{j \in [n] - \{i\} \mid y_j \neq 0\}$. $SC_k^v(i)$ will err only if the errors on
$v$ line up with the non-zero symbols of $y$, and hence only if $E \cap S_y \neq \emptyset$. Therefore, $\Pr[SC_k^v(i) \neq c_i] \le
\Pr[E \cap S_y \neq \emptyset] \le |E| \max_{j \in E,\, j \neq i} \Pr_y[y_j \neq 0] \le k\tau + \theta(n^{-1})$. □

Since all the strings $v$ we are considering are at distance less than $\tau$ from $C$, we get the result of Theorem 2 for
the self-correctability of sparse small-biased linear codes in $\mathbb{F}_q^n$.
5 Conclusion and future work

We proved that sparse codes with small bias over large alphabets are locally testable and self-correctable.
We used properties of the generalized Krawtchouk polynomials and some basic results from coding theory,
such as the MacWilliams identity and the Johnson bound. The next step is to relax the small bias condition while
maintaining a good minimum distance. Kaufman and Sudan were able to remove the small bias condition
for local testability in the case of sparse binary codes with large distance in [3]. Even in the binary case,
removing the small bias condition for self-correctability of sparse codes with large distance is still an open
problem. Moreover, the techniques used in [4] might be extended to get list-decodability of sparse q-ary
codes.